Abstract—The Automatic Identification System (AIS) broadcasts a
ship's navigation data automatically and autonomously over the VHF
band, and plays an important role in collision avoidance and
maritime situational awareness. In inland waterways, however, the
AIS data link is often obstructed by river banks and mountains, or
disturbed by electromagnetic interference. Consequently, AIS
dynamic data are often lost or mixed with inaccurate records, which
may lead to misjudgment of the traffic situation. To address this
problem, this paper proposes a method to enhance the availability
of AIS data. First, based on ship maneuverability, a set of factors
(moving distance, speed, acceleration, and course change rate) is
designed to screen inaccurate AIS data. Then, piecewise cubic
Hermite interpolation and cubic spline interpolation are employed
to restore the AIS data. A real AIS trajectory is used to validate
the accuracy of the two interpolation methods, and the results show
that cubic spline interpolation performs better than piecewise
cubic Hermite interpolation. Field experiments in the Wuhan reach
of the Yangtze River show that the proposed method is effective:
the position accuracy is 3.5 m, the speed accuracy is 0.05 m/s, and
the course accuracy is up to 0.7 degrees, so the AIS data can be
accurately repaired.


Keywords—AIS; data availability; ship maneuverability; cubic


I. INTRODUCTION


Owing to its large transport capacity, low cost, low energy
consumption and low pollution, waterway transportation plays an
important role in the comprehensive transportation system of China.
Research on the informatization and intelligent systems of shipping
management has long been a concern of the international shipping
industry, and obtaining real-time traffic information is the key to
building an intelligent shipping system [1].


The Automatic Identification System (AIS) is composed of shore
stations and ship stations. The ship station uses a GPS module to
obtain the dynamic information of the vessel, such as position
(given by longitude and latitude), speed over ground, and course
over ground. Meanwhile, the pilot enters the static information,
such as the MMSI, call sign, gross tonnage, draft and dimensions,
into the ship station with a keyboard. The ship station then
modulates the dynamic and static information into AIS messages and
broadcasts them to the surrounding ship stations. According to the
IMO regulations, a ship station usually sends dynamic AIS messages
every 6 to 30 seconds when the ship is under normal voyage [2-5].
The shore station receives and decodes the AIS messages transmitted
by the ship stations, and then displays the ship information on the
VTS screen. Therefore, the AIS enables the exchange of ship
information among vessels and maritime administrators, which is
helpful for maritime regulation and ship collision avoidance.


AIS is currently an important means of collecting ship traffic
information. AIS messages contain large volumes of information, so
the AIS is an important data source for analyzing the motion
patterns and navigational risk of vessels. However, the AIS relies
on VHF band communication, which makes its data link unreliable.
Consequently, the raw AIS data are not fully available, i.e. they
are neither completely correct nor complete [6,7]. Incorrect and
lost AIS data interfere with maritime regulation and lead to
misjudgment in maritime situational awareness. They also reduce the
effectiveness of analyses of ship motion patterns and traffic flow
that are based on AIS data. Therefore, it is necessary to study the
correctness and integrity of AIS data, to identify the inaccurate
AIS data, and to restore the lost AIS data.


Ma et al. [8,9] studied the availability of AIS information and
proposed approaches for identifying inaccurate AIS data based on
the DS evidence theory and an improved DSmT theory [10,11];
however, the construction of evidence in these methods lacks a
strict reasoning process. To improve the effectiveness of the
evidence, Liu et al. [12,13] gave a likelihood-based method that
constructs the evidence by statistical analysis of effective prior
data samples, and proposed methods based on the ER rule and the
PCR6 rule to identify inaccurate AIS data. The evidence-based
methods can identify inaccurate AIS data accurately, but they seem
inefficient, because many effective data samples must be collected
before application. Interpolation methods are popular for restoring
time series. Liu et al. [14] employed cubic spline interpolation to
restore the coordinates of lost AIS data. Nguyen [15] used
piecewise cubic Hermite interpolation to restore the trajectories
of vessels. However, these studies are applicable only if the lost
AIS data are not consecutive, and they do not consider the
restoration of speed and course. Liu et al. [16] introduced a
random algorithm into AIS position restoration, but the randomness
of the position is not strong enough to fit the actual AIS data.


In this paper, we first summarize the categories of inaccurate AIS
data and propose a method to screen them. Then, a comparative study
is conducted to validate the accuracy of the improved cubic spline
interpolation method and of piecewise cubic Hermite interpolation.
Experiments show that the proposed method can correctly screen the
inaccurate AIS data and restore the lost data effectively.


II.


To capture real-time AIS data, an AIS shore station (base station
model SAAB R40) was set up in the Wuhan section of the Yangtze
River. The shore station can receive AIS data from vessels within
about 10 kilometers. Raw AIS data were collected in October 2016 to
validate the proposed method. Based on the raw AIS data, the
categories of inaccurate AIS data are summarized as follows:
unreasonable stop while moving forward, impossibly high speed,
drift track point, unreasonable acceleration, and unreasonable rate
of turn. The raw AIS data must be screened to obtain the correct
data. In this paper, we design five rules to screen the inaccurate
data according to the ship's maneuverability, as follows:


A. Unreasonable stop while moving


Sometimes adjacent AIS records are exactly the same even though the
vessel is moving ahead. For this type of inaccurate data, the
cleaning rule is as follows: if the speed of the i-th point is
higher than 2 knots, but its coordinates, speed, and course are the
same as those of the (i-1)-th point, the i-th record should be
deleted, as shown in equation (1):


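$$
\begin{cases}
v_i = \sqrt{v_{lon,i}^2 + v_{lat,i}^2} > 2 \\
|lon_i - lon_{i-1}| = 0 \\
|lat_i - lat_{i-1}| = 0 \\
|v_{lon,i} - v_{lon,i-1}| = 0 \\
|v_{lat,i} - v_{lat,i-1}| = 0
\end{cases}
\qquad i = 2, 3, \ldots, m-1, m \tag{1}
$$
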
B. Impossibly high speed


Due to the limitations of engine power and navigational rules, the
speed of vessels in inland rivers is less than 16 knots. Thus, the
valid speed range of AIS data is 0~16 knots, and a record violates
this range if it satisfies equation (2):


$$v_i = \sqrt{v_{lon,i}^2 + v_{lat,i}^2} > 16, \qquad i = 2, 3, \ldots, m-1, m \tag{2}$$


Any AIS record that satisfies equation (2) should be
deleted.
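
As a minimal sketch of rules A and B, the two checks can be written as predicates over consecutive records. The `AisPoint` record layout and the function names are hypothetical and only serve the illustration; speeds are assumed to be in knots, and the first record is kept because equations (1) and (2) start at i = 2.

```python
import math
from dataclasses import dataclass

STOP_THRESHOLD_KNOTS = 2.0   # rule A threshold from the text
MAX_SPEED_KNOTS = 16.0       # rule B threshold from the text


@dataclass
class AisPoint:
    """Hypothetical record layout; speed components are in knots."""
    t: float
    lon: float
    lat: float
    v_lon: float
    v_lat: float

    @property
    def speed(self) -> float:
        return math.hypot(self.v_lon, self.v_lat)


def is_frozen(prev: AisPoint, cur: AisPoint) -> bool:
    """Rule A, Eq. (1): faster than 2 knots but identical to the previous record."""
    return (cur.speed > STOP_THRESHOLD_KNOTS
            and cur.lon == prev.lon and cur.lat == prev.lat
            and cur.v_lon == prev.v_lon and cur.v_lat == prev.v_lat)


def is_overspeed(cur: AisPoint) -> bool:
    """Rule B, Eq. (2): speed above 16 knots."""
    return cur.speed > MAX_SPEED_KNOTS


def screen_a_b(track):
    """Keep the first record, then drop records flagged by rule A or rule B."""
    kept = list(track[:1])
    for prev, cur in zip(track, track[1:]):
        if not (is_frozen(prev, cur) or is_overspeed(cur)):
            kept.append(cur)
    return kept
```

Each record is compared with its immediate predecessor in the raw sequence, as in equation (1), rather than with the last record that was kept.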


C. Drift track point 


When a track point drifts away, the distance between the drift
point and the adjacent track points increases, and so does the
average speed between them. If the average speed exceeds the
maximum speed of the ship, the track point is considered erroneous,
as shown in equation (3):


$$\frac{\sqrt{\left(k_1 (x_i - x_{i-1})\right)^2 + \left(k_2 (y_i - y_{i-1})\right)^2}}{t_i - t_{i-1}} > 16 \tag{3}$$


where $x_i$ and $y_i$ are the longitude and latitude of the i-th
track point, $t_i$ is its time, and $k_j$ $(j = 1, 2)$ are the
factors for converting coordinate differences to distance in
meters, with $k_1 = 96297.6$ and $k_2 = 111194.9$ near latitude
30°N. The average speed on the left-hand side is converted to knots
before the comparison. Any track point that satisfies equation (3)
should be deleted.
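
The drift check can be sketched as follows, assuming that the average speed is the converted distance divided by the time difference between adjacent records and that it is compared with 16 knots. The function names and the knot conversion constant are illustrative additions, not part of the original method description.

```python
import math

K_LON = 96297.6            # k1: meters per degree of longitude near 30°N
K_LAT = 111194.9           # k2: meters per degree of latitude
KNOT_IN_MS = 1852.0 / 3600.0
MAX_SPEED_KNOTS = 16.0


def average_speed_knots(lon0, lat0, t0, lon1, lat1, t1):
    """Average speed between two track points, in knots (times in seconds)."""
    if t1 <= t0:
        raise ValueError("timestamps must be strictly increasing")
    dist_m = math.hypot(K_LON * (lon1 - lon0), K_LAT * (lat1 - lat0))
    return dist_m / (t1 - t0) / KNOT_IN_MS


def is_drift(lon0, lat0, t0, lon1, lat1, t1):
    """Rule C: flag the later point when the implied speed exceeds 16 knots."""
    return average_speed_knots(lon0, lat0, t0, lon1, lat1, t1) > MAX_SPEED_KNOTS
```

For example, a jump of 0.001° in latitude within 10 s corresponds to roughly 111 m, i.e. about 21.6 knots, so the point would be flagged.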








D. Unreasonable acceleration 


According to the design specifications of inland ships, the
distance a ship needs to accelerate from rest to the design speed
is about 20 times the ship length. When the ship is empty, this
distance is reduced to 1/2~2/3 of the original distance. To obtain
the maximum acceleration, the minimum distance is taken as 10 times
the ship length. Let the length of the ship be L, the design speed
be Vm, and the time to accelerate from rest to the design speed be
tm. Assuming uniform acceleration, the travelling distance of the
ship is given by formula (4) and illustrated in Fig. 1, and the
maximum acceleration of the ship follows from formula (5).
Fig. 1. Diagram for computing the maximum distance and acceleration (horizontal axis: time T, from 0 to tm)


$$L_m = 10 \times L = 0.5 \times t_m \times V_m \tag{4}$$

$$a_{max} = \frac{V_m}{t_m} = \frac{V_m^2}{20 \times L} \tag{5}$$


Usually, the maximum speed of a ship is 16 knots, so Vm is about
8.23 m/s. The length of the ship can be queried from the AIS static
data. For a ship with a length of 110 m, the maximum acceleration
is about 0.03 m/s². The maximum acceleration of any cargo ship can
thus be calculated by (4) and (5), and any data whose acceleration
exceeds this maximum should be deleted.
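
The arithmetic behind formulas (4) and (5) can be checked with a short sketch. The function names are hypothetical, and the default design speed of 8.23 m/s is the value quoted in the text for 16 knots.

```python
def max_acceleration(length_m: float, v_design_ms: float = 8.23) -> float:
    """Maximum acceleration in m/s^2 from formulas (4) and (5)."""
    # Formula (4): 10 * L = 0.5 * t_m * V_m, hence t_m = 20 * L / V_m.
    t_m = 20.0 * length_m / v_design_ms
    # Formula (5): a_max = V_m / t_m = V_m^2 / (20 * L).
    return v_design_ms / t_m


def acceleration_ok(v0_ms: float, v1_ms: float, dt_s: float, length_m: float) -> bool:
    """Rule D: the speed change between two records must stay within a_max."""
    return abs(v1_ms - v0_ms) / dt_s <= max_acceleration(length_m)
```

With these definitions, `max_acceleration(110)` evaluates to about 0.031 m/s² and `max_acceleration(120)` to about 0.028 m/s², in line with the values quoted for the 110 m and 120 m ships.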


E. Unreasonable rate of turn 


According to the design specifications of inland river ships, the
turning diameter of a ship can be obtained by the formula
D = k × L, in which k is a coefficient measuring the ship's
maneuverability; for inland river ships, k usually ranges from 2 to
4. The turning diameter determines the maximum variation rate of
the ship's track. As shown in Fig. 2, when the ship is in steady
motion, its heading follows the tangent of the trajectory. Assume
that the ship moves from D0 to D1 at times t0 and t1, respectively,
with the speed v unchanged and the arc angle ω; the maximum rate of
turn can then be derived from equations (6), (7) and (8).





Fig. 2. Schematic diagram of the maximum rate of turn for a ship entering a
constant cycle


$$D = k \times L, \quad k \in [2, 4] \tag{6}$$

$$\Delta t = t_1 - t_0 = \frac{\pi D}{v} \times \frac{\omega}{360} \tag{7}$$

$$\frac{\omega}{\Delta t} = \frac{360\, v}{\pi k L} \tag{8}$$

where ω is the steering angle of the ship in degrees, L is the
length of the ship in meters, Δt is the elapsed time in seconds,
and v is the ship speed in m/s. For example, for a ship 110 m long
sailing at 8 knots, the maximum rate of turn is 2.1437°/s, taking
the minimum value k = 2.
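
The worked example can be reproduced with a small sketch of equation (8). The helper for the observed rate of turn, including its wrap-around handling at 0°/360°, is an illustrative assumption and is not described in the text.

```python
import math

KNOT_IN_MS = 1852.0 / 3600.0


def max_rate_of_turn(length_m: float, speed_ms: float, k: float = 2.0) -> float:
    """Equation (8): maximum rate of turn in degrees per second."""
    return 360.0 * speed_ms / (math.pi * k * length_m)


def observed_rate_of_turn(course0_deg: float, course1_deg: float, dt_s: float) -> float:
    """Signed course change per second, taking the shorter way around the circle."""
    delta = (course1_deg - course0_deg + 180.0) % 360.0 - 180.0
    return delta / dt_s


def turn_ok(course0_deg, course1_deg, dt_s, length_m, speed_ms, k=2.0) -> bool:
    """Rule E: the observed rate of turn must not exceed the maximum."""
    limit = max_rate_of_turn(length_m, speed_ms, k)
    return abs(observed_rate_of_turn(course0_deg, course1_deg, dt_s)) <= limit
```

For a 110 m ship at 8 knots with k = 2, `max_rate_of_turn(110, 8 * KNOT_IN_MS)` gives about 2.1437°/s, matching the figure above.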







After removing the erroneous AIS data, we obtain a correct AIS
sequence. However, part of the original AIS data is then missing
and must be repaired to obtain complete and correct data. According
to the AIS protocol, the reporting interval of a class-A ship
station does not exceed 10 s, and that of a class-B ship station
does not exceed 30 s. When repairing AIS data, we therefore
interpolate the data so that the maximum time interval is 10 s for
class-A stations and 30 s for class-B stations.


III. METHOD FOR DATA RESTORATION


After data cleaning, we obtain an accurate AIS data series. For
better interpolation, the time, longitude, latitude, speed and
course in the AIS data are selected for preprocessing, and the raw
data are sorted by time. The latitude and longitude of the starting
track point are treated as the origin of coordinates. The time
difference between adjacent AIS records and the speed components
along the longitude and latitude directions are then calculated.
Let $t_i$ be the time, $lon_i$ the longitude, $lat_i$ the latitude,
$v_i$ the speed, and $\theta_i$ the course angle; then
$v_{lon,i} = v_i \cos\theta_i$ is the speed component along
longitude and $v_{lat,i} = v_i \sin\theta_i$ is the speed component
along latitude.
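
The preprocessing step described above can be sketched as follows. The tuple layout `(t, lon, lat, speed, course_deg)` and the dictionary keys are hypothetical; the sketch follows the component convention stated in the text (cosine for the longitude direction, sine for the latitude direction) and stores times relative to the first record instead of pairwise differences.

```python
import math


def preprocess(records):
    """Sort records by time, shift the origin to the first point, split the speed.

    records: iterable of (t, lon, lat, speed, course_deg) tuples (assumed layout).
    """
    ordered = sorted(records, key=lambda r: r[0])
    t0, lon0, lat0 = ordered[0][:3]
    out = []
    for t, lon, lat, v, course_deg in ordered:
        theta = math.radians(course_deg)
        out.append({
            "t": t - t0,                    # seconds since the first record
            "lon": lon - lon0,              # offset from the starting point
            "lat": lat - lat0,
            "v_lon": v * math.cos(theta),   # convention used in the text
            "v_lat": v * math.sin(theta),
        })
    return out
```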


A. Piecewise cubic Hermite interpolation 

Suppose there are n track points in the track sequence. On the time
interval $(t_i, t_{i+1})$, the Hermite interpolation polynomial of
longitude is:


$$lon(t) = a_i t^3 + b_i t^2 + c_i t + d_i \tag{9}$$

Differentiating (9) with respect to time gives the speed along
longitude:

$$v_{lon}(t) = \frac{d\,lon(t)}{dt} = 3 a_i t^2 + 2 b_i t + c_i \tag{10}$$


On the first interval $(t_1, t_2)$, the values $t_1$, $t_2$,
$lon_1$, $lon_2$, $v_{lon1}$ and $v_{lon2}$ are known; substituting
them into (9) and (10) gives:


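$$
\begin{cases}
lon(t_1) = a_1 t_1^3 + b_1 t_1^2 + c_1 t_1 + d_1 = lon_1 \\
lon(t_2) = a_1 t_2^3 + b_1 t_2^2 + c_1 t_2 + d_1 = lon_2 \\
v_{lon1} = 3 a_1 t_1^2 + 2 b_1 t_1 + c_1 \\
v_{lon2} = 3 a_1 t_2^2 + 2 b_1 t_2 + c_1
\end{cases} \tag{11}
$$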


Solving equation (11) yields the Hermite polynomial coefficients
$a_1$, $b_1$, $c_1$, $d_1$ on the interval $(t_1, t_2)$, so that:


$$
\begin{cases}
lon(t) = a_1 t^3 + b_1 t^2 + c_1 t + d_1 \\
v_{lon}(t) = \dfrac{d\,lon(t)}{dt} = 3 a_1 t^2 + 2 b_1 t + c_1
\end{cases} \tag{12}
$$


Eq. (12) is the interpolation polynomial on the interval
$(t_1, t_2)$. Hence, the longitude and the speed along longitude
can be obtained at any time within the interval by Eq. (12). The
latitude and the speed along latitude can be obtained in the same
way. Applying the above equations to every pair of adjacent points
$(t_i, t_{i+1})$ of the original AIS sequence yields n-1
interpolation polynomials of the form (12), one per interval.
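
A minimal sketch of one Hermite segment is given below. Instead of solving the 4×4 linear system (11) for the coefficients, it evaluates the same unique cubic through the standard Hermite basis functions, which is algebraically equivalent; the function name and argument order are illustrative assumptions.

```python
def hermite_segment(t1, lon1, v1, t2, lon2, v2):
    """Return (position, velocity) functions of the cubic matching both end
    values and both end derivatives on [t1, t2]."""
    h = t2 - t1

    def position(t):
        s = (t - t1) / h
        h00 = 2 * s**3 - 3 * s**2 + 1
        h10 = s**3 - 2 * s**2 + s
        h01 = -2 * s**3 + 3 * s**2
        h11 = s**3 - s**2
        return h00 * lon1 + h10 * h * v1 + h01 * lon2 + h11 * h * v2

    def velocity(t):
        # Time derivative of position(); note ds/dt = 1 / h.
        s = (t - t1) / h
        d00 = (6 * s**2 - 6 * s) / h
        d10 = 3 * s**2 - 4 * s + 1
        d01 = (-6 * s**2 + 6 * s) / h
        d11 = 3 * s**2 - 2 * s
        return d00 * lon1 + d10 * v1 + d01 * lon2 + d11 * v2

    return position, velocity
```

Because the interpolating cubic is unique, sampling any cubic polynomial and its derivative at the two end points and evaluating the segment in between reproduces that polynomial exactly, which gives a convenient self-check.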


After interpolation, we obtain piecewise functions of the time t.
Combining the speed components along longitude and latitude, the
restored speed is $V = \sqrt{v_{lon}^2 + v_{lat}^2}$, and the
course can be obtained by Eq. (13):








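$$
\theta =
\begin{cases}
\arccos\left(\dfrac{v_{lon}}{V}\right), & v_{lat} > 0 \\
360 - \arccos\left(\dfrac{v_{lon}}{V}\right), & v_{lat} < 0
\end{cases} \tag{13}
$$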
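
The speed and course recovery can be sketched as below, using the component convention of the preprocessing step (cosine along longitude, sine along latitude). Two details are assumptions of this sketch: the boundary case of a zero latitude component is assigned to the first branch, and a stationary ship returns no course.

```python
import math


def speed_and_course(v_lon: float, v_lat: float):
    """Return (speed, course in degrees) from the two speed components."""
    v = math.hypot(v_lon, v_lat)
    if v == 0.0:
        return 0.0, None          # course is undefined for a stationary ship
    ratio = max(-1.0, min(1.0, v_lon / v))   # guard against rounding outside [-1, 1]
    angle = math.degrees(math.acos(ratio))
    course = angle if v_lat >= 0 else 360.0 - angle
    return v, course
```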


B. Cubic spline interpolation


After preprocessing the original data, the speed components along
longitude and latitude are obtained. Because the speed components
and the longitude and latitude are continuous in time, spline
functions of time can be built for the speed $v_{lon}$ along
longitude and the speed $v_{lat}$ along latitude, respectively. The
spline functions are then integrated to obtain the longitude and
latitude as functions of time, so that the position, speed and
course are available at any time. For the AIS sequence, on each
sub-interval $[t_i, t_{i+1}]$ a cubic polynomial $s(t)$ is
constructed such that:


$$s(t_i) = v_{lon,i}, \quad s'(t_i) = a_{lon,i}, \quad s(t_{i+1}) = v_{lon,i+1}, \quad s'(t_{i+1}) = a_{lon,i+1}$$

where $a_{lon,i}$ denotes the acceleration along longitude at
$t_i$. Let $h_i = t_{i+1} - t_i$ and $s'(t_i) = m_i$, so we can
get:


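$$
s(t) = \varphi_0\!\left(\frac{t - t_i}{h_i}\right) v_{lon,i}
     + \varphi_1\!\left(\frac{t - t_i}{h_i}\right) v_{lon,i+1}
     + h_i\,\psi_0\!\left(\frac{t - t_i}{h_i}\right) m_i
     + h_i\,\psi_1\!\left(\frac{t - t_i}{h_i}\right) m_{i+1} \tag{14}
$$


In equation (14), $\varphi_0(t) = (t-1)^2 (2t+1)$ and
$\psi_0(t) = t (t-1)^2$, so:
$\varphi_1(t) = t^2 (-2t+3)$ and $\psi_1(t) = t^2 (t-1)$.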
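Taking the second derivative of (14) on $[t_i, t_{i+1}]$, we have:


$$
\begin{aligned}
s''(t_i) &= 6\,\frac{v_{lon,i+1} - v_{lon,i}}{h_i^2} - \frac{4 m_i + 2 m_{i+1}}{h_i} \\
s''(t_{i+1}) &= -6\,\frac{v_{lon,i+1} - v_{lon,i}}{h_i^2} + \frac{2 m_i + 4 m_{i+1}}{h_i}
\end{aligned} \tag{15}
$$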
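To ensure the continuity of the second derivative at each interior
node $t_i$, we require:

$$
\frac{m_{i-1} + 2 m_i}{h_{i-1}} + \frac{2 m_i + m_{i+1}}{h_i}
= 3\left(\frac{v_{lon,i} - v_{lon,i-1}}{h_{i-1}^2} + \frac{v_{lon,i+1} - v_{lon,i}}{h_i^2}\right) \tag{16}
$$

Let:

$$
\alpha_i = \frac{h_{i-1}}{h_{i-1} + h_i}, \qquad
\beta_i = 3\left[(1 - \alpha_i)\,\frac{v_{lon,i} - v_{lon,i-1}}{h_{i-1}} + \alpha_i\,\frac{v_{lon,i+1} - v_{lon,i}}{h_i}\right]
$$
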

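Then (16) can be written as:

$$(1 - \alpha_i)\, m_{i-1} + 2 m_i + \alpha_i\, m_{i+1} = \beta_i$$

Since $m_0 = v'_{lon,0}$ and $m_n = v'_{lon,n}$ are known, the
unknowns $m_1, m_2, \ldots, m_{n-1}$ of the AIS series are obtained
from equations (17):

$$
\begin{cases}
2 m_1 + \alpha_1 m_2 = \beta_1 - (1 - \alpha_1)\, v'_{lon,0} \\
(1 - \alpha_2)\, m_1 + 2 m_2 + \alpha_2 m_3 = \beta_2 \\
\qquad \vdots \\
(1 - \alpha_{n-2})\, m_{n-3} + 2 m_{n-2} + \alpha_{n-2} m_{n-1} = \beta_{n-2} \\
(1 - \alpha_{n-1})\, m_{n-2} + 2 m_{n-1} = \beta_{n-1} - \alpha_{n-1}\, v'_{lon,n}
\end{cases} \tag{17}
$$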


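The coefficient matrix of equations (17) is given by (18):


$$
A = \begin{bmatrix}
2 & \alpha_1 & & & \\
1 - \alpha_2 & 2 & \alpha_2 & & \\
 & \ddots & \ddots & \ddots & \\
 & & 1 - \alpha_{n-2} & 2 & \alpha_{n-2} \\
 & & & 1 - \alpha_{n-1} & 2
\end{bmatrix} \tag{18}
$$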


After solving the system with this coefficient matrix, we obtain
piecewise spline functions of $v_{lon}$ and $v_{lat}$ with respect
to time t. The corresponding longitude and latitude are then
obtained by integrating the speed components along longitude and
latitude, and the course is obtained through Eq. (13).
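
The tridiagonal system for the node slopes can be solved with the Thomas algorithm, as in the sketch below. It assumes that the end slopes `m0` and `mn` are supplied by the caller and that the nodes are indexed from 0 to n; the function names are hypothetical, and the integration step that turns the speed spline into positions is left out.

```python
import bisect


def solve_slopes(t, v, m0, mn):
    """Solve system (17) for the slopes m_0..m_n of a clamped cubic spline."""
    n = len(t) - 1
    if n == 1:
        return [m0, mn]
    h = [t[i + 1] - t[i] for i in range(n)]
    sub, diag, sup, rhs = [], [], [], []
    for i in range(1, n):
        alpha = h[i - 1] / (h[i - 1] + h[i])
        beta = 3.0 * ((1 - alpha) * (v[i] - v[i - 1]) / h[i - 1]
                      + alpha * (v[i + 1] - v[i]) / h[i])
        sub.append(1 - alpha)
        diag.append(2.0)
        sup.append(alpha)
        rhs.append(beta)
    rhs[0] -= sub[0] * m0        # move the known end slopes to the right side
    rhs[-1] -= sup[-1] * mn
    for k in range(1, n - 1):    # forward elimination
        w = sub[k] / diag[k - 1]
        diag[k] -= w * sup[k - 1]
        rhs[k] -= w * rhs[k - 1]
    inner = [0.0] * (n - 1)      # back substitution
    inner[-1] = rhs[-1] / diag[-1]
    for k in range(n - 3, -1, -1):
        inner[k] = (rhs[k] - sup[k] * inner[k + 1]) / diag[k]
    return [m0] + inner + [mn]


def spline_value(t, v, m, x):
    """Evaluate the spline at x using the Hermite form of each segment."""
    i = max(0, min(len(t) - 2, bisect.bisect_right(t, x) - 1))
    h = t[i + 1] - t[i]
    s = (x - t[i]) / h
    return ((2 * s**3 - 3 * s**2 + 1) * v[i] + (s**3 - 2 * s**2 + s) * h * m[i]
            + (-2 * s**3 + 3 * s**2) * v[i + 1] + (s**3 - s**2) * h * m[i + 1])
```

A clamped cubic spline reproduces any cubic polynomial exactly when the end slopes are exact, which makes a simple correctness check possible even with unevenly spaced nodes.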


IV. EXPERIMENT 


To validate the proposed restoration algorithm, we chose 163 AIS
records of a ship (call sign HAIYOU668, MMSI 413802276, ship length
120 m, Class-A ship station with a transmission interval of 10 s)
collected on October 10, 2016 in Wuhan. The coordinates, the
original speed and acceleration, and the original course and rate
of turn are shown in Fig. 3, Fig. 4, and Fig. 5, respectively.


Fig. 3. Trajectory of raw AIS data (horizontal axis: longitude/(°))
Fig. 5. Original course and rate of turn (axis: steering rate/(°/s))


The original data contain many inaccuracies. The length of the ship
is 120 meters, so the maximum acceleration is 0.028 m/s². In this
voyage, the speed is no more than 10 knots, and the maximum rate of
turn is 1.23°/s. After screening, the correct data are shown in
Fig. 6, Fig. 7 and Fig. 8, respectively.
Fig. 6. AIS trajectory points after cleaning


Fig. 8. Course and rate of turn after cleaning


After data cleaning, 110 correct records are obtained. The speed,
acceleration, course and rate of turn of the cleaned data agree
with the actual data, which indicates that the inaccurate data have
been deleted.


To verify the validity of the two methods presented in this paper,
we chose 50 correct AIS records and manually deleted 5 consecutive
points. The deleted track points were then restored from the
remaining 45 records and compared with the original data. The
results are shown in Fig. 9 and Fig. 10.


Fig. 9. Repair comparison of latitude and longitude (horizontal axis: time (s))


Fig. 10. Repair comparison of speed and course

To analyze the accuracy of longitude and latitude, the errors in
longitude and latitude between the restored points and the original
points are converted into distance. The mean value and the mean
square error of the two restoration methods are shown in Table I
and Table II.


TABLE I. Mean error of the two methods

                Distance    Speed       Course
Hermite         7.301989    0.089212    -0.73175
Cubic spline    6.533656    0.078576    -0.91702


TABLE II. Error variance of the two methods

                Distance    Speed       Course
Hermite         4.004335    0.060271    0.731704
Cubic spline    2.637348    0.042466    0.751654


From the tables, we can conclude that the improved cubic spline
interpolation method performs better. The piecewise cubic Hermite
interpolation restores the latitude and longitude by constructing a
unique cubic polynomial on each interval. The derivative of this
cubic polynomial with respect to time is continuous, hence the
speed is relatively smooth. However, the rate of change of speed is
not taken into consideration, so the speed is prone to change
abruptly from one interval to the next, and the fitting effect is
poor. In contrast, the derivative of the improved cubic spline
interpolation with respect to time is smooth. With the cubic
spline, the position error is about 6.5 m, the speed error is about
0.07 m/s, and the course error is about 0.91°.
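
Error statistics of the kind reported in Table I and Table II could be computed along the following lines. This is a sketch under stated assumptions: positions are `(lon, lat)` pairs in degrees, the conversion factors are the ones given for latitude 30°N, and the population variance is used because the text does not say which variance definition was applied.

```python
import math
from statistics import fmean, pvariance

K_LON = 96297.6    # meters per degree of longitude near 30°N
K_LAT = 111194.9   # meters per degree of latitude


def position_errors_m(restored, original):
    """Distance in meters between each restored point and its original point."""
    return [math.hypot(K_LON * (r[0] - o[0]), K_LAT * (r[1] - o[1]))
            for r, o in zip(restored, original)]


def summarize(errors):
    """Return (mean, population variance) of a list of errors."""
    return fmean(errors), pvariance(errors)
```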





After cleaning the inaccurate records from the original AIS data,
the AIS sequences were restored by the improved cubic spline
interpolation. The data lost after cleaning range from the 100th to
the 115th point. The experimental results are shown in Fig. 11,
Fig. 12 and Fig. 13, respectively.
Fig. 11. Track points before and after restoration


Fig. 12. The speed and acceleration after restoration


Fig. 13. The course and rate of turn after restoration


V. CONCLUSION


In this paper, five categories of inaccuracy in AIS data are
presented, and five corresponding rules are designed to clean the
inaccurate data according to ship maneuverability. The field
experiment indicates that the five rules delete the inaccurate data
effectively. To restore the deleted and lost AIS data, piecewise
cubic Hermite interpolation and cubic spline interpolation are
employed, and real AIS trajectories are used to validate the
accuracy of the two interpolation methods. The results show that
cubic spline interpolation restores fragmentary AIS data better.


However, the cubic spline interpolation is effective only when the
number of consecutively lost records is less than five. In
practice, more records are lost consecutively when a ship passes
through curved channels and mountainous waterways, or encounters
electromagnetic interference. In this paper, we did not consider
the situation in which more than five consecutive track points are
lost. In future work, we will take this situation into
consideration.